冬来冬往的博客

Encoder-Decoder Architecture

2023-07-17

What is it?

The Encoder-Decoder architecture is a sequence-to-sequence architecture. This means that it takes a sequence as input, for example a sequence of words, and outputs another sequence, such as a sequence of text.
It is a machine that consumes sequences and spits out sequences.

How does it do it?

Encoder

Produces a vector representation of the input sentence.

A recurrent neural network (RNN) encoder takes the tokens of the input sequence one at a time and, at each step, produces a state representing the current token as well as all the previously ingested tokens.
Then the state is used in the next encoding step as input along with the next token to produce the next state.
Once you are done ingesting all the input tokens into the RNN, you output a vector that essentially represents the full input sentence.
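
The encoding loop described above can be sketched in a few lines of plain Python. This is only a minimal illustration, assuming a simple tanh RNN cell with random, untrained weights; the names `rnn_step`, `encode`, `w_state`, and `w_input` are made up for this example and do not come from any particular library.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation, not trained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def rnn_step(state, token_vec, w_state, w_input):
    """One encoding step: combine the previous state with the next token."""
    from_state = matvec(w_state, state)
    from_token = matvec(w_input, token_vec)
    return [math.tanh(a + b) for a, b in zip(from_state, from_token)]

def encode(token_vecs, w_state, w_input, hidden_size):
    """Ingest tokens one at a time; the final state represents the whole input."""
    state = [0.0] * hidden_size  # initial state before any token is seen
    for token_vec in token_vecs:
        state = rnn_step(state, token_vec, w_state, w_input)
    return state

rng = random.Random(0)
HIDDEN, EMBED = 4, 3  # hypothetical sizes chosen for the example
w_state = make_matrix(HIDDEN, HIDDEN, rng)
w_input = make_matrix(HIDDEN, EMBED, rng)

# Three toy "token embeddings" standing in for a three-word sentence.
sentence = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
context = encode(sentence, w_state, w_input, HIDDEN)
print(context)
```

Because each state is computed from the previous one, feeding the same tokens in a different order generally gives a different final vector.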

Decoder

Creates the sequence output.

The decoder takes the vector representation of the input sentence and produces an output sentence from that representation.
In the case of an RNN decoder, it does this in steps, decoding the output one token at a time using the current state and what has been decoded so far.
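
As a rough sketch of this step-by-step decoding, the snippet below runs a toy RNN decoder that always picks the highest-scoring token (greedy decoding, which is one possible strategy and an assumption here). The vocabulary, the weights, the `decode` function, and the choice to reuse the `<eos>` slot as the start symbol are all hypothetical; with untrained random weights the output is meaningless, and only the control flow matters.

```python
import math
import random

VOCAB = ["<eos>", "hello", "world", "again"]  # made-up toy vocabulary

def make_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def decode(context, w_state, w_input, w_out, max_len=5):
    """Greedy decoding: one token per step, until <eos> or max_len."""
    state = list(context)            # start from the encoder's vector
    prev = one_hot(0, len(VOCAB))    # assumption: <eos> slot doubles as start symbol
    output = []
    for _ in range(max_len):
        # The new state depends on the current state and the last decoded token.
        from_state = matvec(w_state, state)
        from_prev = matvec(w_input, prev)
        state = [math.tanh(a + b) for a, b in zip(from_state, from_prev)]
        # Score every vocabulary entry and keep the best one.
        scores = matvec(w_out, state)
        best = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[best] == "<eos>":
            break
        output.append(VOCAB[best])
        prev = one_hot(best, len(VOCAB))
    return output

rng = random.Random(1)
HIDDEN = 4  # hypothetical state size
w_state = make_matrix(HIDDEN, HIDDEN, rng)
w_input = make_matrix(HIDDEN, len(VOCAB), rng)
w_out = make_matrix(len(VOCAB), HIDDEN, rng)

context = [0.1, -0.2, 0.3, 0.05]  # stands in for the encoder output
result = decode(context, w_state, w_input, w_out)
print(result)
```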

Transformer

In a Transformer, the simple RNN is replaced by transformer blocks, which are based on the attention mechanism.
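
To give a feel for the attention mechanism, here is a minimal sketch of scaled dot-product attention for a single query, the form commonly used inside transformer blocks. It leaves out everything else a real block contains (learned projections, multiple heads, feed-forward layers), and the toy keys, values, and query are invented for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Weight each value by how well its key matches the query."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    mixed = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return mixed, weights

# Three toy positions, each with a key and a value vector.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [4.0, 0.0]

mixed, weights = attention(query, keys, values)
print(weights)
print(mixed)
```

Unlike the RNN, which passes information along one step at a time, the query here looks at all positions at once and mixes their values according to the weights.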

Tags: AI
